{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Implementation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This section shows how the linear regression extensions discussed in this chapter are typically fit in Python. First let's import the {doc}`Boston housing</content/appendix/data>` dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np \n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from sklearn import datasets\n",
    "boston = datasets.load_boston()\n",
    "X_train = boston['data']\n",
    "y_train = boston['target']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Regularized Regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both Ridge and Lasso regression can be easily fit using `scikit-learn`. A bare-bones implementation is provided below. Note that the regularization parameter `alpha` (which we called $\\lambda$) is chosen arbitrarily."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.linear_model import Ridge, Lasso\n",
    "alpha = 1\n",
    "\n",
    "# Ridge\n",
    "ridge_model = Ridge(alpha = alpha)\n",
    "ridge_model.fit(X_train, y_train)\n",
    "\n",
    "\n",
    "# Lasso\n",
    "lasso_model = Lasso(alpha = alpha)\n",
    "lasso_model.fit(X_train, y_train);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In practice, however, we want to choose `alpha` through cross validation. This is easily implemented in `scikit-learn` by designating a set of `alpha` values to try and fitting the model with `RidgeCV` or `LassoCV`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.linear_model import RidgeCV, LassoCV\n",
    "alphas = [0.01, 1, 100]\n",
    "\n",
    "# Ridge\n",
    "ridgeCV_model = RidgeCV(alphas = alphas)\n",
    "ridgeCV_model.fit(X_train, y_train)\n",
    "\n",
    "# Lasso\n",
    "lassoCV_model = LassoCV(alphas = alphas)\n",
    "lassoCV_model.fit(X_train, y_train);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can then see which values of `alpha` performed best with the following."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Ridge alpha: 0.01\n",
      "Lasso alpha: 1.0\n"
     ]
    }
   ],
   "source": [
    "print('Ridge alpha:', ridgeCV.alpha_)\n",
    "print('Lasso alpha:', lassoCV.alpha_)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bayesian Regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also fit Bayesian regression using `scikit-learn` (though another popular package is `pymc3`). A very straightforward implementation is provided below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.linear_model import BayesianRidge\n",
    "bayes_model = BayesianRidge()\n",
    "bayes_model.fit(X_train, y_train);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is not, however, identical to our construction in the previous section since it infers the $\\sigma^2$ and $\\tau$ parameters, rather than taking those as fixed inputs. More information can be found [here](https://scikit-learn.org/stable/modules/linear_model.html#bayesian-regression). The hidden chunk below demonstrates a hacky solution for running Bayesian regression in `scikit-learn` using known values for $\\sigma^2$ and $\\tau$, though it is hard to imagine a practical reason to do so"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{toggle}\n",
    "By default, Bayesian regression in `scikit-learn` treats $\\alpha = \\frac{1}{\\sigma^2}$ and $\\lambda = \\frac{1}{\\tau}$ as random variables and assigns them the following prior distributions\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\alpha &\\sim \\text{Gamma}(\\alpha_1, \\alpha_2) \n",
    "\\\\\n",
    "\\lambda &\\sim \\text{Gamma}(\\lambda_1, \\lambda_2).\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "Note that $E(\\alpha) = \\frac{\\alpha_1}{\\alpha_2}$ and $E(\\lambda) = \\frac{\\lambda_1}{\\lambda_2}$. To *fix* $\\sigma^2$ and $\\tau$, we can provide an extremely strong prior on $\\alpha$ and $\\lambda$, guaranteeing that their estimates will be approximately equal to their expected value.\n",
    "\n",
    "Suppose we want to use $\\sigma^2 = 11.8$ and $\\tau = 10$, or equivalently $\\alpha = \\frac{1}{11.8}$, $\\lambda = \\frac{1}{10}$. Then let\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\alpha_1 &= 10000 \\cdot \\frac{1}{11.8}, \\\\\n",
    "\\alpha_2 &= 10000, \\\\\n",
    "\\lambda_1 &= 10000 \\cdot \\frac{1}{10}, \\\\\n",
    "\\lambda_2 &= 10000.\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "This guarantees that $\\sigma^2$ and $\\tau$ will be approximately equal to their pre-determined values. This can be implemented in `scikit-learn` as follows\n",
    "\n",
    "```{code}\n",
    "big_number = 10**5\n",
    "\n",
    "# alpha\n",
    "alpha = 1/11.8\n",
    "alpha_1 = big_number*alpha\n",
    "alpha_2 = big_number\n",
    "\n",
    "# lambda \n",
    "lam = 1/10\n",
    "lambda_1 = big_number*lam\n",
    "lambda_2 = big_number\n",
    "\n",
    "# fit \n",
    "bayes_model = BayesianRidge(alpha_1 = alpha_1, alpha_2 = alpha_2, alpha_init = alpha,\n",
    "                     lambda_1 = lambda_1, lambda_2 = lambda_2, lambda_init = lam)\n",
    "bayes_model.fit(X_train, y_train);\n",
    "```\n",
    "\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Poisson Regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "GLMs are most commonly fit in Python through the `GLM` class from `statsmodels`. A simple Poisson regression example is given below.\n",
    "\n",
    "As we saw in the GLM concept section, a GLM is comprised of a random distribution and a link function. We identify the random distribution through the `family` argument to `GLM` (e.g. below, we specify the `Poisson` family). The default link function depends on the random distribution. By default, the Poisson model uses the link function\n",
    "\n",
    "$$\n",
    "\\eta_n = g(\\mu_n) = \\log(\\lambda_n),\n",
    "$$\n",
    "\n",
    "which is what we use below. For more information on the possible distributions and link functions, check out the `statsmodels` GLM [docs](https://www.statsmodels.org/stable/glm.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "import statsmodels.api as sm\n",
    "X_train_with_constant = sm.add_constant(X_train)\n",
    "\n",
    "poisson_model = sm.GLM(y_train, X_train, family=sm.families.Poisson())\n",
    "poisson_model.fit();"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Edit Metadata",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
